Skip to main content

BigQuery Data Type Update

The workflow Env - BigQuery Data Type Update in functions repo is used to update the data type of a specific field on root level in BigQuery tables (_latest and _changelog tables) in the bqDataLake dataset. The workflow is triggered manually via the workflow_dispatch event with user-provided inputs from github actions.


📋 Workflow Overview​

Trigger​

Manually triggered using workflow_dispatch.

Input Parameters​

Input NameDescriptionRequiredDefault
table_nameName of the base table✅ Yes""
field_nameName of the field to be updated✅ Yes""
data_typeTarget BigQuery data type✅ Yes""

🧠 Function Logic (updateDataTypeOnRootLevel.ts)​

The script performs the following actions:

  1. Reads environment variables:

    • PROJECT_ENV, TABLE_NAME, FIELD_NAME, DATA_TYPE
  2. Prepares table names:

    • Appends _latest and _changelog to the base table name
  3. Generates and executes ALTER TABLE queries:

    • Updates the column's data type in both tables using BigQuery SQL

TypeScript Code Summary​

const tables = [`${tableName}_latest`, `${tableName}_changelog`];

for (const table of tables) {
const sqlQuery = `
ALTER TABLE ${projectEnv}.bqDataLake.${table}
ALTER COLUMN ${field} SET DATA TYPE ${dataType};`;

await bigQuery.query({ query: sqlQuery });
console.log(`Updated column data type in table ${table}`);
}

Error Handling​

If any required environment variable is missing or the query fails, an error message is logged:

console.log('Error:', error?.errors?.[0]?.message ?? 'Data type update failed.');

✅ Example Usage​

To trigger this workflow, go to the Actions tab in GitHub, select ENV - BigQuery Data Type Update, and provide the required inputs:

  • Table Name: accounts_serviceUnit_enquiry_invoice
  • Field Name: grandTotal.value
  • Data Type: STRING

📌 Notes​

Updating Nested Fields​

📄 Casted Fields Query Example​

The following is an example query that casts nested fields in accounts_serviceUnit_enquiry_invoice_changelog:

CREATE OR REPLACE TABLE bqDataLake.accounts_serviceUnit_enquiry_invoice_changelog
PARTITION BY DATE(document_timestamp) AS
SELECT
* EXCEPT(jobs),
ARRAY(
SELECT AS STRUCT
job.* EXCEPT(priceLine),
(SELECT AS STRUCT
job.priceLine.* EXCEPT(parts,labour),
(SELECT AS STRUCT
job.priceLine.parts.* EXCEPT(partsAndFluids),
ARRAY(
SELECT AS STRUCT
paf.* EXCEPT(unitPrice, quantity),
(SELECT AS STRUCT
paf.unitPrice.* EXCEPT(raw),
CAST(paf.unitPrice.raw AS FLOAT64) AS raw
) AS unitPrice,
(SELECT AS STRUCT
paf.quantity.* EXCEPT(raw),
CAST(paf.quantity.raw AS FLOAT64) AS raw
) AS quantity
FROM UNNEST(job.priceLine.parts.partsAndFluids) AS paf
) AS partsAndFluids
) AS parts,
(SELECT AS STRUCT
job.priceLine.labour.* EXCEPT(manufacturerGoodwill),
(SELECT AS STRUCT
job.priceLine.labour.manufacturerGoodwill.* EXCEPT(raw),
CAST(job.priceLine.labour.manufacturerGoodwill.raw AS FLOAT64) AS raw
) AS manufacturerGoodwill
) AS labour
) AS priceLine
FROM UNNEST(jobs) AS job
) AS jobs
FROM bqDataLake.accounts_serviceUnit_enquiry_invoice_changelog;
Field PathOriginal TypeNew Type
job.priceLine.parts.partsAndFluids.unitPrice.rawINTEGERFLOAT64
job.priceLine.parts.partsAndFluids.quantity.rawINTEGERFLOAT64
job.priceLine.labour.manufacturerGoodwill.rawINTEGERFLOAT64